-
Notifications
You must be signed in to change notification settings - Fork 713
[nrf noup] bluetooth: att: Fix CONFIG_BT_ATT_SENT_CB_AFTER_TX
#3323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's revert the old no-up that's broken and reapply the whole fixed feature as one commit. This will help us keep no-ups manageable and help us verify the fix.
The fix seems dubious. We need to clarify some things. Why is there a && (buf->ref == 1)
? And why does the comment say "might"?
The comment should also be rephrased to more clearly state where the reference was taken, and ideally what the reference is currently "attached to" in a way that makes it easy to show that getting rid of it here is correct.
7bea193
to
6791da9
Compare
…nd_buf" This reverts commit adb8f4e. Signed-off-by: Marek Pieta <[email protected]>
…p_send_pdu" This reverts commit 510e36c. Signed-off-by: Marek Pieta <[email protected]>
… TX is done" This reverts commit d1d9b2d. Signed-off-by: Marek Pieta <[email protected]>
By default, the BLE stack calls sent callback for ATT data when the data is passed to BLE controller for transmission. Enabling this Kconfig option delays calling the sent callback until data transmission is finished by BLE controller (the callback is delayed until receiving the Number of Completed Packets HCI Event). If the ATT sent callback is delayed until data transmission is done by BLE controller, the transmitted buffer may have an additional reference. The reference is used to extend lifetime of the net buffer until the data transmission is confirmed by ACK of the remote. Jira: NCSDK-27422 Jira: NCSDK-28624 Jira: NCSDK-35650 Signed-off-by: Marek Pieta <[email protected]>
6791da9
to
5d62bf5
Compare
Pure rebase |
if (IS_ENABLED(CONFIG_BT_ATT_SENT_CB_AFTER_TX)) { | ||
__ASSERT_NO_MSG(buf->ref <= 2); | ||
|
||
if (buf->ref == 2) { | ||
net_buf_unref(buf); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Detecting ATT PDUs by checking buf->ref == 2 is fragile because future changes might introduce other scenarios with extra references for different reasons. I'm not even sure that it's not the case today. Instead of using the ref count as a proxy, I suggest comparing the cb:
if (IS_ENABLED(CONFIG_BT_ATT_SENT_CB_AFTER_TX)) {
bt_conn_tx_cb_t cb = closure_cb(buf->user_data);
if (cb == chan_sent_cb) {
void *user_data = closure_data(buf->user_data);
/* Release the extra reference attached to user_data */
cb(chan->conn, user_data, -ESHUTDOWN);
}
}
But I realized it's a bug that we don't invoke the callbacks here. We should be invoking all the callbacks here just like in process_unack_tx()
. I made a PR upstream to fix this just now:
zephyrproject-rtos/zephyr#97056
I suggest we cherry pick that general fix instead. Can you test that it works for you when it passes the upstream CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bt_l2cap_send_pdu
and bt_l2cap_dyn_chan_send
both seem to demand that ref
counter is 1
for buffers placed on the TX queue. Also it seems that the net_buf_unref
here is expected to remove the last reference (please correct me if I overlooked something).
Using the closure_cb
as in the code snippet above would require to pass the chan_sent_cb
function (through header or using extern
keyword). The function would no longer be static for the att.c
file. I tried to avoid that as it might not be a clean solution.
Invoking the sent callback here might help. I will test your commit locally and let you know if it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
bt_l2cap_send_pdu
andbt_l2cap_dyn_chan_send
both seem to demand thatref
counter is1
for buffers placed on the TX queue. Also it seems that thenet_buf_unref
here is expected to remove the last reference (please correct me if I overlooked something).
That's just to protect against user error. We did not foresee that a stack-internal user would keep an extra reference and wait for the callback. We just saw no use for the extra reference and it is indicative of a leaked reference. The stack does not rely on this fact AFAIK. We can remove all these checks. Maybe we should.
Using the
closure_cb
as in the code snippet above would require to pass thechan_sent_cb
function (through header or usingextern
keyword). The function would no longer be static for theatt.c
file. I tried to avoid that as it might not be a clean solution.
I prefer that. We are referring to ATT anyway (to explain the code), so including the header is just being explicit about the dependency.
Invoking the sent callback here might help. I will test your commit locally and let you know if it works.
Thanks! It passed CI checks. Invoking the sent callback is a generalization of my code snippet here, so I'm sure it will work. Let's just look out for unintentional breakage elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirmed your fix from the upstream PR. As soon as the upstream PR is merged, I will update this PR.
The
delay ATT sent callback until data transmission is done by controller
feature introduces an extranet_buf
reference which is removed on sent callback. Make sure to remove the reference on BLE disconnection to free the buffer as the registered sent callback will not be called.Jira: NCSDK-35650